Comparative study of GMM, DTW, and ANN on Thai speaker identification system
Abstract
This paper investigates the Gaussian mixture model (GMM) by comparing it with preliminary experiments using a multilayer perceptron (MLP) network trained with the backpropagation learning algorithm (BKP) and dynamic time warping (DTW) on a Thai text-dependent speaker identification system. The three identification engines are evaluated on 50 speakers using the isolated digits 0-9. Training and testing utterances were recorded over a five-week period. In addition, three well-known speech features were evaluated: linear predictive coding derived cepstrum (LPCC), postfiltered cepstrum (PFL), and Mel frequency cepstral coefficients (MFCC). In our previous experiments, MFCC gave the highest identification rates with DTW and MLP. GMM was therefore tested with the MFCC feature and attained 87.54% average identification accuracy, compared with 86.74% for DTW and 82.34% for MLP. The same ranking holds for the top-3 concatenated digits, where the average identification rates are 99%, 98.70%, and 97.30% for GMM, DTW, and MLP, respectively.
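The abstract does not describe how the DTW engine was implemented, so the following is only a minimal sketch of DTW template matching in general, not the authors' system. It assumes one stored reference sequence per speaker and uses toy one-dimensional frames in place of MFCC vectors; the names `dtw_distance`, `identify`, and `templates` are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors.

    Each sequence is a list of equal-length lists of floats (one per frame).
    The local cost is the Euclidean distance between frames (an assumption).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch b's frame over another frame of a
                cost[i][j - 1],      # stretch a's frame over another frame of b
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def identify(test_utterance, templates):
    """Return the speaker whose template is closest to the test utterance."""
    return min(templates, key=lambda spk: dtw_distance(test_utterance, templates[spk]))


# Toy data: hypothetical 1-D "features" standing in for MFCC frames.
templates = {
    "speaker_A": [[0.0], [1.0], [2.0]],
    "speaker_B": [[5.0], [5.0], [5.0]],
}
test_utterance = [[0.0], [0.0], [1.0], [2.0]]  # a slower rendition of speaker_A
print(identify(test_utterance, templates))
```

In a closed-set setting like the one described, the test utterance is always assigned to one of the enrolled speakers, which is why `identify` simply takes the minimum distance rather than applying a rejection threshold.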
Similar resources
Comparative Study of Continuous Hidden Markov Models (CHMM) and Artificial Neural Network (ANN) on Speaker Identification System
This paper reports a comparative study between the continuous hidden Markov model (CHMM) and the artificial neural network (ANN) on a text-dependent, closed-set speaker identification (SID) system with Thai-language recordings made in an office environment. Thai isolated digits 0-9 and their concatenation are used as the spoken text. Mel frequency cepstral coefficients (MFCC) are selected as the studied features. Tw...
Improvement of speaker verification for Thai language
Many strategies have been proposed for speaker verification (SV) systems, in both the text-dependent (fixed-text) and text-independent (free-text) domains. To find an appropriate algorithm for Thai speech, several successive improvement methods are compared in this paper, including dynamic time warping (DTW) matching and Gaussian mixture model (GMM) based systems. We first developed a syste...
Comparative Study of Speaker Recognition Methods: DTW, GMM and SVM
Speaker recognition is a process in which a person is recognized on the basis of his or her voice signals. The problem of speaker recognition belongs to a much broader topic in science and engineering called pattern classification. In this paper we provide a brief overview of the evolution of pattern classification techniques used in speaker recognition. We also discuss our proposed process ...
Speaker Identification Using Gaussian Mixture Models
In this paper, the performance of Perceptual Linear Prediction (PLP) features is compared with the performance of Linear Prediction Coefficient (LPC) features for speaker identification. Two classification techniques, Gaussian Mixture Models (GMM) and Vector Quantization (VQ) with dynamic time warping (DTW), are used to classify speakers based on their speech samples into respec...
Linear and non-linear fusion of ALISP-based and GMM systems for text-independent speaker verification
Current state-of-the-art speaker verification algorithms use Gaussian Mixture Models (GMM) to estimate the probability density function of the acoustic feature vectors. They are denoted here as global systems. To give better performance, they have to be combined with other classifiers, using different fusion methods. The performance of the final classifier depends on the choice of the s...